Abstract


An implemented and operational model-based vision system is described.
Examples are given of its interpretation of images, including extraction of three
dimensional parameters from monocular images. Advances are presented in representation
for geometric modeling of objects and object classes, in techniques for manipulating
non-linear symbolic algebraic constraints, in geometric reasoning in incompletely
specified situations, and in constructing algebraic constraints from image measurements.
Both generic object classes and specific objects are represented by volume models which
are independent of viewpoint. Complex real world object classes are modeled. Variations
in size, structure and spatial relations within object classes can be modeled. New
spatial reasoning techniques are described which are useful both for prediction within a
vision system, and for planning within a manipulation system. New approaches to
prediction and interpretation are introduced, based on the propagation of symbolic
constraints. Predictions are two-pronged. First, prediction graphs provide a coarse
filter for hypothesizing matches of objects to image features. Second, prediction graphs
contain instructions on how to use measurements of image features to deduce three
dimensional information about tentative object interpretations. Interpretation proceeds
by merging local hypothesized matches, subject to consistent derived implications about
the size, structure and spatial configuration of the hypothesized objects. Prediction,
description and interpretation proceed concurrently, from coarse object subpart and class
interpretations of images, to fine distinctions among object subclasses, and more precise
three dimensional quantification of objects.
This dissertation makes contributions to a number of areas of research. They are
linked  by  a  central  thesis: 


specified. Chapter 6 describes methods for handling complex geometric relationships. It
also provides methods for making deductions from the relationships between objects and

The second appendix (A2) describes the particular image description proce


1 A conversion package is under construction which allows unchanged Acronym
source files to run in Franz Lisp under UNIX on a VAX.


The  ACRONYM  System 


Volumetric models and spatial relations are represented in the object graph.
Volume  elements  form  the  nodes,  while  spatial  relations  and  subpart  relations  form  the 


of a library of qualitatively different generalized cone models, including both single cones




to planar faces of other primitives. The system represents objects internally
by keeping track of the edges produced by the intersections and unions. The Padl


independent models should be given to the system. The resolution of the problem
of  multiple  appearances  from  multiple  viewpoints  then  becomes  the  responsibility  of 
the  vision  system  itself.  For  a  model  to  be  completely  viewpoint  independent  yet  still 


interpretation  tasks.  In  fact  some  of  the  examples  below  may  seem  to  have  been  carried 
out  successfully  in  spite  of  the  representation  mechanism.  Other  vision  and  modeling 


hypothesized an interpretation of some image features as an instance of an object with
its generalized cone descriptor, it does not search for subparts of the object in the
image.

Acronym's volumetric representation is built around units of class object (a
unit's class is given by its class slot; this corresponds roughly to the self slot of Krl




i.2.3  Variations  in  Spatial  Relationships. 


The  representation  of  articulated  objects  may  be  important  if  manipulator  arms 
are  present  in  images,  and  it  is  desired  to  visually  calibrate  or  servo  them.  Soroka’s  [55] 
simulator  is  based  on  these  representations. 


two examples of variable camera geometries. If the characteristics of the imaging camera
are not known exactly, the focal-ratio slot can be filled with a quantifier rather than
a number. Any image interpretation will provide information which can be used to
constrain this quantifier (see chapter 8 for how this comes about).

The user specifies part of the restriction graph to the system. Other parts are
added by Acronym while carrying out image understanding tasks. By contrast the
object graph is completely specified by the user, perhaps from a CAD data-base, and
remains static during image interpretation. Eventually Acronym may be able to build
from examples, using techniques of Nevatia and Binford [47].


of it being necessary to recompute image to model correspondences for the specialized
model, it suffices to simply take the meet of the specialization restriction node with a
restriction node produced in the original interpretation. If the resultant restriction node


volumetric description among all its specializations. There are never multiple copies of
fragments of the volume model. The specialization information is in a domain orthogonal
to the underlying representation. It is therefore compact. More importantly, during
image interpretation, when an instance of a superclass has been identified it is rather easy
to check whether it happens to also be an instance of a more specialized class. Instead


values for the quantifiers. This is quite different from the tasks required of other CMSs.

tion and interpretation algorithms, independent of the heuristic power which is required


requirements of ^2 of section 5.2. They are monotonic also. The partial decision
procedure is based on these algorithms (see section 5.4.3).

while UPPER constructs "min(200, 200 - x/y)" which gets simplified to "200 - x/y".
These definitions of UPPER and LOWER closely follow those used by Bledsoe [13] and
Shostak [53]. They did not use HIVAL and LOVAL.
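
As a rough numeric illustration of what bounding an expression over constrained quantifiers involves, the sketch below computes lower and upper bounds for the expression 200 - x/y mentioned above. It uses plain interval arithmetic in Python rather than the symbolic procedures discussed in the text; the function names and the sample bounds on x and y are assumptions made for the example.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def sub(a: Interval, b: Interval) -> Interval:
    """Bounds on a - b."""
    return (a[0] - b[1], a[1] - b[0])


def div(a: Interval, b: Interval) -> Interval:
    """Bounds on a / b, assuming the divisor is strictly positive."""
    if b[0] <= 0:
        raise ValueError("divisor interval must be strictly positive")
    candidates = [a[0] / b[0], a[0] / b[1], a[1] / b[0], a[1] / b[1]]
    return (min(candidates), max(candidates))


def bound_200_minus_x_over_y(x: Interval, y: Interval) -> Interval:
    """Numeric lower and upper bounds on 200 - x/y."""
    return sub((200.0, 200.0), div(x, y))


# Hypothetical constraints: 10 <= x <= 40 and 2 <= y <= 5.
lo, hi = bound_200_minus_x_over_y((10.0, 40.0), (2.0, 5.0))
print(lo, hi)
```

Narrowing the intervals for x or y can only narrow the resulting bounds, which is the numeric counterpart of the monotonicity property noted above.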


Constraint Manipulation




First note that algorithms SUPPP and INFFF terminate, since all recursive calls
reduce the number of symbols in their first argument, and they exit simply when the


The  quantifiers  are  well  constrained  by 


Ambler and Popplestone [4] assume they are given a description of a goal state of
spatial relationships between a set of objects, such as "against" and "fits", and describe

(chapter 5) and the Acronym geometric simplifier described below are together able to
make stronger deductions than those described by McDermott.


6.2  Geometric  Reasoning  in  ACRONYM. 


algebraic simplifier they use does not produce a canonical form. It is inherent in the
methods themselves. Similar arguments to those of de Kleer and Sussman [34] apply
to this case also. If the mechanisms which use the simplified geometric expressions are

be made of a single rotation and translation expression, where the axis and magnitude
of the rotation are both complex trigonometric forms.


onent  of  a  coordinate  transform. 




products in the terms of the sum. To simplify the final translation expression, rules SR7,


7.3 Observable Feature Prediction.


scaled according to the distance of the object from the camera. Examples of why this
is useful are given below. The perspective-normal projection in Acronym is further
simplified by using the z camera coordinate of the origin of the cone coordinate frame,
rather than z* as defined above.

major axis then another useful rule could come into play. From expression (7.2) it would
deduce that in the image the major axis of the ellipse will be normal to the spine of the
ribbon.) Later in the prediction it is decided that the ellipse corresponding to the top of
the screwdriver tool will actually be occluded (as described in section 6.2), but that need
not concern us here.


that the cylinder will appear as a ribbon generated by its swept surface, and an ellipse
generated by its initial cross section. Furthermore they will be connected in the image.
(If the descriptive process which found ellipses were able to accurately determine their

level descriptive processes [20] which search the image for candidate shapes to be matched
to predictions. Given that the focal ratio is 2.42 and the length of the screwdriver tool is
1, and using the expanded z component of (6.4), the algebraically simplified prediction


where 2.42 is the focal ratio of the camera and TOOL.CAMZ is an internal quantifier
generated by the prediction module.

shape feature matched. Each provides a number of such back constraints which combine
to further constrain the individual parameters.
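
One way to picture a back constraint: suppose the predicted image length of the tool is the focal ratio times the model length divided by the tool's depth along the camera axis. That form, the positive-depth convention, the function names, the error bound and the measured values below are all assumptions for illustration; the simplified prediction expression itself is not reproduced here. A measured length with a relative error bound then confines the depth to an interval, and the constraints from several matched features combine by intersection.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def depth_bounds(measured: float, focal_ratio: float,
                 model_length: float, rel_err: float) -> Interval:
    """Depth interval implied by one measured image length.

    Assumes predicted_length = focal_ratio * model_length / depth, depth > 0.
    """
    shortest = measured * (1.0 - rel_err)
    longest = measured * (1.0 + rel_err)
    scale = focal_ratio * model_length
    # A longer image length means a nearer object, so the bounds swap.
    return (scale / longest, scale / shortest)


def combine(a: Interval, b: Interval) -> Optional[Interval]:
    """Intersect two depth intervals; None means the matches disagree."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# Two hypothetical measurements of the same tool (focal ratio 2.42, length 1).
first = depth_bounds(0.10, 2.42, 1.0, 0.30)
second = depth_bounds(0.12, 2.42, 1.0, 0.30)
print(first, second, combine(first, second))
```

Each additional matched feature can only shrink the combined interval, so loose individual measurements may still pin a parameter down fairly tightly.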




If  the  angle  between  the  spines  of  two  generalised  cones  as  viewed  from  the 


rectangular  cross-section,  and  each  of  the  ribbons  generated  by  the  two  visible  swept 


of  the  interpretation.  It  has  all  the  back  constraints  added  to  it  which  are  generated  by 
hypothesizing the match (see section 7.4 for details of back constraints). Each complete


than complete graph embeddings, does lead to incorrect image interpretations. The
constraint system allows for such relaxed matching, but still provides a mechanism for

models. However this information is actually implicitly available elsewhere and so a
new scheme was developed, whereby the system decides itself from class rather than


A  large  correct  interpretation  graph  has  associated  with  it  a  restriction  node 
which  specialises  both  object  models  and  their  spatial  relations  to  the  three  dimensional 
understanding  of  the  world  derived  from  the  feature  prediction  hypothesized  matches 


overcome address space limitations. Unfortunately there is still very little working space
left  in  the  primary  fork,  so  that  the  garbage  collector  must  constantly  be  at  work  and 
Acronym  usually  runs  out  of  address  space  before  completing  an  example.  One  solution 
would  be  to  try  to  partition  the  primary  fork  again.  However,  coupling  across  forks  is 
by  way  of  files  and  so  it  is  necessary  that  the  modules  in  different  forks  are  able  to  work 


efficiently with only occasional interactions with the other fork. Otherwise the system
is slower in wall clock time by a few orders of magnitude. Since in the primary fork,


degradation of image information at this stage. This is the only data which the Acronym
reasoning system is given to interpret. Notice that in figure 9.6, almost all the shapes
corresponding to aircraft are lost. Quite a few aircraft in 9.4 are lost also. Besides losing
many shapes, the combination of the edge finder and edge linker conspire to give very
inaccurate image measurements. All image measurements are assumed to have a ±30%

Once a consistent match or partial match to a geometric model has been found in
the context of some set of constraints (model class), it is easy to check whether it might also
be an instance of a subclass. It is only necessary to add the extra constraints associated
with the subclass and check for consistency with those constraints already implied by
the interpretation using the CMS as described in chapter 8. The aircraft located in 9.5<f
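
The subclass check can be sketched as follows, under a strong simplifying assumption: each set of constraints is reduced to a numeric interval per quantifier, so that adding constraints becomes interval intersection. The actual constraints are symbolic inequalities handled by the constraint manipulation system; the data layout, the function name and all numbers here are hypothetical.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]
Restriction = Dict[str, Interval]  # quantifier name -> allowed interval


def meet(a: Restriction, b: Restriction) -> Optional[Restriction]:
    """Conjoin two constraint sets; None means they are inconsistent."""
    out: Restriction = dict(a)
    for name, (lo, hi) in b.items():
        cur_lo, cur_hi = out.get(name, (float("-inf"), float("inf")))
        new_lo, new_hi = max(cur_lo, lo), min(cur_hi, hi)
        if new_lo > new_hi:
            return None  # no value satisfies both sets
        out[name] = (new_lo, new_hi)
    return out


# Hypothetical bounds implied by an interpretation, and two made-up subclasses.
interpretation = {"FUSELAGE-LENGTH": (50.0, 75.0)}
subclass_a = {"FUSELAGE-LENGTH": (68.0, 72.0)}
subclass_b = {"FUSELAGE-LENGTH": (40.0, 48.0)}

print(meet(interpretation, subclass_a))  # consistent: may be an instance
print(meet(interpretation, subclass_b))  # inconsistent: ruled out
```

No image features are re-matched in this check; only the constraint sets are combined, which is why testing a subclass is cheap once the superclass has been identified.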
The following code defines the subpart tree and attaches the names of generalized
cone descriptors to objects which have them. The function "spq" defines a unit which
describes a quantified number of subparts of the same geometric class. Its first argument

parameterizable in a single variable. The parameter is preceded by the dummy word
"over" and ranges over the integers from 1 up to the expression preceded by the dummy
word "upto".
Now we add the constraints for the subclass BOEING-747 and its two subclasses
BOEING-747B and BOEING-747SP.

;;; constraints for BOEING-747
; not quite specific

(user-variable BASF - LENGTH

(constrain BOEING-747SP with
<*: wing-attachment 27 46)
(' FUSELAGE-LENGTH 5E.0)) ; made up!
(constrain example with
  (c-interval SCREW-WOBBLE-X (* -2 DEGREES) (* 2 DEGREES))
  (c-interval SCREW-WOBBLE-Z (* -2 DEGREES) (* 2 DEGREES)))


The  Pemm  is  guided  by  a  PEMM-program.  It  consists  of  three  sets  of  heuristics 
number  of  parameter  values.  In  Acronym  the  prediction  algorithms  produce  a 
PEMM-program simultaneously as they produce the prediction graph. The first section


The Ribbon and Ellipse Finder. There are four issues to resolve for such a tree search.


the PEMM-program) fully developed at the head of the list. The result of the search is
the edge list associated with the node at the head of the list, which will be the highest
scoring node found during the search. Note that a node can never be promoted on the

contour. A candidate contour is retained only if it and its associated direction satisfy all
the predicates in cullers. Note that these predicates too may be parameterized by global
variables set by the PEMM-program. Also, the ordering of cullers cannot affect the final
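
The culling step can be pictured as a conjunction of predicates. In this sketch the candidate representation, the predicate names and the thresholds are all hypothetical; only the rule that a candidate survives when every culler accepts it is taken from the description above.

```python
from typing import Callable, List, Tuple

# Hypothetical candidate: (contour length, direction in degrees).
Candidate = Tuple[float, float]
Culler = Callable[[Candidate], bool]

# Stand-ins for global variables that a PEMM-program might set.
MIN_LENGTH = 5.0
MAX_TURN = 45.0


def long_enough(c: Candidate) -> bool:
    return c[0] >= MIN_LENGTH


def roughly_aligned(c: Candidate) -> bool:
    return abs(c[1]) <= MAX_TURN


def cull(candidates: List[Candidate], cullers: List[Culler]) -> List[Candidate]:
    """Keep only the candidates accepted by every culler."""
    return [c for c in candidates if all(p(c) for p in cullers)]


candidates = [(3.0, 10.0), (8.0, 20.0), (9.0, 80.0), (6.0, -30.0)]
cullers = [long_enough, roughly_aligned]
kept = cull(candidates, cullers)
print(kept)
```

Because the predicates here have no side effects, reordering them can change how much work is done per candidate but not which candidates survive.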
?var (inc ?) → ?-free subtree (includes atoms)


unifies  with,  for  example: 


( • -PROGM  -T)) 

BAZOLA  (.-SUBGOAl  '(AOD-BACIC  ,QP  ,(»S:  -#|.o  HI  C  Xi> )  ) ) 

l.-SUBGOAL  (AOD-BACIC  .QP  i(M4  .HI  LOf«P))) 


